172 research outputs found

    Paronyms for Accelerated Correction of Semantic Errors

    Get PDF
    * Work done under partial support of the Mexican Government (CONACyT, SNI), IPN (CGPI, COFAA), and the Korean Government (KIPA Professorship for Visiting Faculty Positions). The second author is currently on sabbatical leave at Chung-Ang University.
    The errors usually made by authors during text preparation are classified. The notion of semantic errors is elaborated, and malapropisms are singled out among them as words “similar” to the intended word but essentially distorting the meaning of the text. For any method of malapropism correction, we propose compiling beforehand dictionaries of paronyms, i.e., words similar to each other in letters, sounds, or morphs. The proposed classification of errors and paronyms is illustrated with English and Russian examples and is valid for many languages. Specific dictionaries of literal and morphemic paronyms are compiled for Russian. It is shown that literal paronyms drastically cut down (up to 360 times) the search for correction candidates, while morphemic paronyms permit correcting errors that have not been studied so far and are characteristic of foreigners.
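    As a rough illustration (not taken from the paper; the edit-distance criterion for "literal paronyms" and all names below are assumptions), a dictionary of literal paronyms can be precompiled so that a malapropism checker scores only the few words similar to a flagged word instead of the whole vocabulary:

        from collections import defaultdict

        def literal_paronyms(vocabulary, max_distance=1):
            """Group words that differ by at most `max_distance` letter edits.
            Naive O(n^2) build; a real dictionary would use smarter indexing."""
            def edit_distance(a, b):
                # classic dynamic-programming Levenshtein distance (rolling row)
                dp = list(range(len(b) + 1))
                for i, ca in enumerate(a, 1):
                    prev, dp[0] = dp[0], i
                    for j, cb in enumerate(b, 1):
                        prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                                 prev + (ca != cb))
                return dp[-1]

            paronyms = defaultdict(set)
            words = list(vocabulary)
            for i, w1 in enumerate(words):
                for w2 in words[i + 1:]:
                    if abs(len(w1) - len(w2)) <= max_distance and \
                       edit_distance(w1, w2) <= max_distance:
                        paronyms[w1].add(w2)
                        paronyms[w2].add(w1)
            return paronyms

        # Usage: correction candidates for a flagged word come from the
        # precompiled dictionary, not from the entire vocabulary.
        vocab = {"ingenious", "ingenuous", "historic", "histrionic", "casual", "causal"}
        dictionary = literal_paronyms(vocab, max_distance=2)
        print(dictionary["causal"])   # -> {'casual'}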

    PolyHope: Two-Level Hope Speech Detection from Tweets

    Full text link
    Hope is characterized as an openness of spirit toward the future: a desire, expectation, and wish for something to happen or to be true that remarkably affects a person's state of mind, emotions, behaviors, and decisions. Hope is usually associated with desired expectations and with the possibility or probability of something happening in the future. Despite its importance, hope has rarely been studied as a social media analysis task. This paper presents a hope speech dataset that classifies each tweet first into "Hope" and "Not Hope", then into three fine-grained hope categories: "Generalized Hope", "Realistic Hope", and "Unrealistic Hope" (along with "Not Hope"). English tweets from the first half of 2022 were collected to build this dataset. Furthermore, we describe our annotation process and guidelines in detail and discuss the challenges of classifying hope and the limitations of existing hope speech detection corpora. In addition, we report several baselines based on different learning approaches, such as traditional machine learning, deep learning, and transformers, to benchmark our dataset. We evaluate our baselines using weighted-averaged and macro-averaged F1-scores. Observations show that a strict process for annotator selection and detailed annotation guidelines enhanced the dataset's quality. This strict annotation process resulted in promising performance for simple machine learning classifiers using only bi-grams; however, the binary and multiclass hope speech detection results reveal that contextual embedding models achieve higher performance on this dataset. Comment: 20 pages, 9 figures.
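    A minimal sketch of the kind of bi-gram baseline the abstract mentions, assuming scikit-learn and made-up toy tweets in place of the actual PolyHope data:

        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression
        from sklearn.metrics import f1_score
        from sklearn.pipeline import make_pipeline

        # Hypothetical toy tweets; the real PolyHope corpus is a labeled tweet dataset.
        train_texts = ["I believe things will get better soon",
                       "hoping for a brighter future for all of us",
                       "nothing good ever happens around here",
                       "just another ordinary day at work"]
        train_labels = ["Hope", "Hope", "Not Hope", "Not Hope"]

        test_texts = ["really hoping tomorrow will get better",
                      "another ordinary day, nothing to report"]
        test_labels = ["Hope", "Not Hope"]

        # Bi-gram features with a linear classifier: the style of simple
        # baseline the abstract says already performs well on this dataset.
        baseline = make_pipeline(
            TfidfVectorizer(ngram_range=(2, 2)),
            LogisticRegression(max_iter=1000))
        baseline.fit(train_texts, train_labels)

        predictions = baseline.predict(test_texts)
        print("weighted F1:", f1_score(test_labels, predictions, average="weighted"))
        print("macro F1:   ", f1_score(test_labels, predictions, average="macro"))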

    Recent Trends in Deep Learning Based Personality Detection

    Full text link
    Recently, the automatic prediction of personality traits has received a lot of attention. Specifically, personality trait prediction from multimodal data has emerged as a hot topic within the field of affective computing. In this paper, we review the significant machine learning models that have been employed for personality detection, with an emphasis on deep learning-based methods. This review provides an overview of the most popular approaches to automated personality detection, the available computational datasets, its industrial applications, and state-of-the-art machine learning models for personality detection, with a specific focus on multimodal approaches. Personality detection is a very broad and diverse topic: this survey focuses only on computational approaches and leaves out psychological studies of personality detection.
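    As a loose illustration of how a multimodal approach can be set up (the feature extractors, labels, and fusion choice below are placeholder assumptions, not any surveyed model), feature-level fusion simply concatenates per-modality features before classification:

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        # Hypothetical feature extractors; real systems would use text encoders
        # (e.g. transformer embeddings) and audio/visual descriptors instead.
        def text_features(transcript):
            return np.array([len(transcript.split()), transcript.count("I")], dtype=float)

        def audio_features(pitch_hz, energy):
            return np.array([pitch_hz, energy], dtype=float)

        # Early (feature-level) fusion: concatenate per-modality features, then
        # train one classifier per trait (a single binary trait here for brevity).
        samples = [
            (text_features("I really love meeting new people"), audio_features(220.0, 0.8)),
            (text_features("I prefer to stay home and read quietly"), audio_features(180.0, 0.3)),
        ]
        X = np.vstack([np.concatenate(pair) for pair in samples])
        y = [1, 0]  # e.g. high vs. low extraversion label

        clf = LogisticRegression().fit(X, y)
        print(clf.predict(X))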

    Representación computacional del lenguaje natural escrito

    Get PDF
    When humans read or hear words, they immediately relate them to a concept. This is possible due to the information already stored in the brain and also to the human ability to select, process, and associate such information with words. However, for a computer, natural language text is only a sequence of bits that does not convey any meaning on its own unless properly processed. A computer interprets this bit sequence by modeling the processing that takes place in human minds, namely structuring the text and linking it with previously stored information. During this process, as well as when describing its results, the text is represented using various formal structures that permit automatic processing, interpretation, and comparison of information. In this paper, we present a detailed description of these structures.
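    A minimal sketch (not taken from the paper) of three such formal structures for one sentence: a token sequence, a bag of words, and hypothetical relational triples:

        from collections import Counter

        sentence = "the computer interprets the text"

        # 1. A surface-level structure: the ordered sequence of tokens.
        tokens = sentence.split()

        # 2. A feature structure: a bag of words with frequencies, which
        #    already supports automatic comparison between texts.
        bag_of_words = Counter(tokens)

        # 3. A relational structure: hypothetical (head, relation, dependent)
        #    triples of the kind a syntactic or semantic analysis would produce.
        triples = [("interprets", "subject", "computer"),
                   ("interprets", "object", "text")]

        print(tokens)
        print(bag_of_words)
        print(triples)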

    Una aproximación para resolución de ambigüedad estructural empleando tres mecanismos diferentes

    Get PDF
    Structural ambiguity is one of the most difficult problems to solve in natural language processing systems. We consider two types of structural ambiguity resolution that can be used in the analysis of unrestricted texts: lexical knowledge and a certain kind of context. In this work, we propose a model based on three different mechanisms for revealing the correct syntactic structure and a classification module for obtaining the most probable structures for the analyzed sentence. Our model is aimed at the analysis of unrestricted texts, and the tools developed require neither disambiguation of morphological tags nor any kind of syntactic tags. Work done with partial support from CONACyT, SNI, and CGEPI-IPN, Mexico.
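    The abstract does not spell out the three mechanisms, so the sketch below only illustrates the general scheme with made-up scores: three independent scoring mechanisms are combined, and the candidate syntactic structures are ranked by plausibility:

        # Hypothetical scores; a real system would derive them from, e.g., lexical
        # statistics, subcategorization knowledge, and local context respectively.
        def lexical_score(candidate):
            return {"verb-attach": 0.7, "noun-attach": 0.3}[candidate]

        def subcat_score(candidate):
            return {"verb-attach": 0.4, "noun-attach": 0.6}[candidate]

        def context_score(candidate):
            return {"verb-attach": 0.8, "noun-attach": 0.2}[candidate]

        def rank_structures(candidates, weights=(1.0, 1.0, 1.0)):
            """Combine the three mechanisms into one score per candidate structure
            and return the candidates sorted from most to least plausible."""
            w1, w2, w3 = weights
            scored = {c: w1 * lexical_score(c) + w2 * subcat_score(c) + w3 * context_score(c)
                      for c in candidates}
            return sorted(scored.items(), key=lambda item: item[1], reverse=True)

        # e.g. "saw the man with the telescope": does the PP attach to the verb or the noun?
        print(rank_structures(["verb-attach", "noun-attach"]))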